{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "aXeleRate_mark_detector.ipynb",
      "provenance": [],
      "private_outputs": true,
      "collapsed_sections": [],
      "mount_file_id": "1tDQwRgaEZqe_E-7g2kgi9QQ9FNl6e_2w",
      "authorship_tag": "ABX9TyOs98RKXDUIV15/zUMc2z4p",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/AIWintermuteAI/aXeleRate/blob/master/resources/aXeleRate_mark_detector.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hS9yMrWe02WQ",
        "colab_type": "text"
      },
      "source": [
        "## M.A.R.K. Detection model Training and Inference\n",
        "\n",
        "In this notebook we will use axelerate, Keras-based framework for AI on the edge, to quickly setup model training and then after training session is completed convert it to .tflite and .kmodel formats.\n",
        "\n",
        "First, let's take care of some administrative details. \n",
        "\n",
        "1) Before we do anything, make sure you have choosen GPU as Runtime type (in Runtime - > Change Runtime type).\n",
        "\n",
        "2) We need to mount Google Drive for saving our model checkpoints and final converted model(s). Press on Mount Google Drive button in Files tab on your left. \n",
        "\n",
        "In the next cell we clone axelerate Github repository and import it. \n",
        "\n",
        "**It is possible to use pip install or python setup.py install, but in that case you will need to restart the enironment.** Since I'm trying to make the process as streamlined as possibile I'm using sys.path.append for import."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "y07yAbYbjV2s",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "%tensorflow_version 1.x\n",
        "!git clone https://github.com/AIWintermuteAI/aXeleRate.git\n",
        "import sys\n",
        "sys.path.append('/content/aXeleRate')\n",
        "from axelerate import setup_training,setup_inference"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5TBRMPZ83dRL",
        "colab_type": "text"
      },
      "source": [
        "At this step you typically need to get the dataset. You can use !wget command to download it from somewhere on the Internet or !cp to copy from My Drive as in this example\n",
        "```\n",
        "!cp -r /content/drive/'My Drive'/pascal_20_segmentation.zip .\n",
        "!unzip --qq pascal_20_segmentation.zip\n",
        "```\n",
        "Dataset preparation and postprocessing are discussed in the article here:\n",
        "\n",
        "The annotation tool I use is LabelImg\n",
        "https://github.com/tzutalin/labelImg\n",
        "\n",
        "Let's visualize our detection model test dataset. There are images in validation folder with corresponding annotations in PASCAL-VOC format in validation annotations folder.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_tpsgkGj7d79",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!gdown https://drive.google.com/uc?id=1s2h6DI_1tHpLoUWRc_SavvMF9jYG8XSi #dataset\n",
        "!gdown https://drive.google.com/uc?id=1-bDRZ9Z2T81SfwhHEfZIMFG7FtMQ5ZiZ #pre-trained model\n",
        "\n",
        "!unzip --qq mark_dataset.zip\n",
        "\n",
        "from axelerate.networks.yolo.backend.utils.augment import visualize_dataset\n",
        "\n",
        "visualize_dataset(img_folder='mark_detection/imgs_validation', ann_folder='mark_detection/ann_validation', num_imgs=10, img_size=224, jitter=True)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "S1oqdtbr7VLB",
        "colab_type": "text"
      },
      "source": [
        "Next step is defining a config dictionary. Most lines are self-explanatory.\n",
        "\n",
        "Type is model frontend - Classifier, Detector or Segnet\n",
        "\n",
        "Architecture is model backend (feature extractor) \n",
        "\n",
        "- Full Yolo\n",
        "- Tiny Yolo\n",
        "- MobileNet1_0\n",
        "- MobileNet7_5 \n",
        "- MobileNet5_0 \n",
        "- MobileNet2_5 \n",
        "- SqueezeNet\n",
        "- VGG16\n",
        "- ResNet50\n",
        "\n",
        "For more information on **anchors**, please read here\n",
        "https://github.com/pjreddie/darknet/issues/568\n",
        "\n",
        "**Labels** are labels present in your dataset.\n",
        "IMPORTANT: Please, list all the labels present in the dataset.\n",
        "\n",
        "**object_scale** determines how much to penalize wrong prediction of confidence of object predictors\n",
        "\n",
        "**no_object_scale** determines how much to penalize wrong prediction of confidence of non-object predictors\n",
        "\n",
        "**coord_scale** determines how much to penalize wrong position and size predictions (x, y, w, h)\n",
        "\n",
        "**class_scale** determines how much to penalize wrong class prediction"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EkASgMdcj3Nu",
        "colab_type": "text"
      },
      "source": [
        "## Parameters for Person Detection\n",
        "\n",
        "K210, which is where we will run the network, has constrained memory (5.5 RAM) available, so with Micropython firmware, the largest model you can run is about 2 MB, which limits our architecture choice to Tiny Yolo, MobileNet(up to 0.75 alpha) and SqueezeNet. Out of these 3 architectures, only one comes with pre-trained model - MobileNet. So, to save the training time we will use Mobilenet with alpha 0.75, which has ... parameters. For objects that do not have that much variety, you can use MobileNet with lower alpha, down to 0.25."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Jw4q6_MsegD2",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "config = {\n",
        "        \"model\":{\n",
        "            \"type\":                 \"Detector\",\n",
        "            \"architecture\":         \"MobileNet5_0\",\n",
        "            \"input_size\":           224,\n",
        "            \"anchors\":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],\n",
        "            \"labels\":               [\"mark\"],\n",
        "            \"coord_scale\" : \t\t1.0,\n",
        "            \"class_scale\" : \t\t1.0,\n",
        "            \"object_scale\" : \t\t5.0,\n",
        "            \"no_object_scale\" : \t1.0\n",
        "        },\n",
        "        \"weights\" : {\n",
        "            \"full\":   \t\t\t\t\"\",\n",
        "            \"backend\":   \t\t    \"imagenet\"\n",
        "        },\n",
        "        \"train\" : {\n",
        "            \"actual_epoch\":         50,\n",
        "            \"train_image_folder\":   \"mark_detection/imgs\",\n",
        "            \"train_annot_folder\":   \"mark_detection/ann\",\n",
        "            \"train_times\":          1,\n",
        "            \"valid_image_folder\":   \"mark_detection/imgs_validation\",\n",
        "            \"valid_annot_folder\":   \"mark_detection/ann_validation\",\n",
        "            \"valid_times\":          1,\n",
        "            \"valid_metric\":         \"mAP\",\n",
        "            \"batch_size\":           32,\n",
        "            \"learning_rate\":        1e-3,\n",
        "            \"saved_folder\":   \t\tF\"/content/drive/My Drive/mark_detector\",\n",
        "            \"first_trainable_layer\": \"\",\n",
        "            \"augumentation\":\t\t\t\tTrue,\n",
        "            \"is_only_detect\" : \t\tFalse\n",
        "        },\n",
        "        \"converter\" : {\n",
        "            \"type\":   \t\t\t\t[\"k210\",\"tflite\"]\n",
        "        }\n",
        "    }"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kobC_7gd5mEu",
        "colab_type": "text"
      },
      "source": [
        "Let's check what GPU we have been assigned in this Colab session, if any."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "rESho_T70BWq",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.python.client import device_lib\n",
        "device_lib.list_local_devices()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cWyKjw-b5_yp",
        "colab_type": "text"
      },
      "source": [
        "Finally we start the training by passing config dictionary we have defined earlier to setup_training function. The function will start the training with Checkpoint, Reduce Learning Rate on Plateau and Early Stopping callbacks. After the training has stopped, it will convert the best model into the format you have specified in config and save it to the project folder."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "deYD3cwukHsj",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from keras import backend as K \n",
        "K.clear_session()\n",
        "model_path = setup_training(config_dict=config)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ypTe3GZI619O",
        "colab_type": "text"
      },
      "source": [
        "After training it is good to check the actual perfomance of your model by doing inference on your validation dataset and visualizing results. This is exactly what next block does. Obviously since our model has only trained on a few images the results are far from stellar, but if you have a good dataset, you'll have better results."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "jE7pTYmZN7Pi",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from keras import backend as K \n",
        "K.clear_session()\n",
        "setup_inference(config, model_path)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5YuVe2VD11cd",
        "colab_type": "text"
      },
      "source": [
        "My end results are:\n",
        "\n",
        "{'fscore': 0.942528735632184, 'precision': 0.9318181818181818, 'recall': 0.9534883720930233}\n",
        "\n",
        "**You can obtain these results by loading a pre-trained model.**\n",
        "\n",
        "Good luck and happy training! Have a look at these articles, that would allow you to get the most of Google Colab or connect to local runtime if there are no GPUs available;\n",
        "\n",
        "https://medium.com/@oribarel/getting-the-most-out-of-your-google-colab-2b0585f82403\n",
        "\n",
        "https://research.google.com/colaboratory/local-runtimes.html"
      ]
    }
  ]
}